Race- and Ethnicity-Related Differences in Heart Failure With Preserved Ejection Fraction Using Natural Language Processing

BACKGROUND Heart failure with preserved ejection fraction (HFpEF) is the predominant form of HF in older adults. It represents a heterogeneous clinical syndrome that is less well understood across different ethnicities.
OBJECTIVES This study aimed to compare the clinical presentation and assess the diagnostic performance of existing HFpEF diagnostic tools between ethnic groups.
METHODS A validated Natural Language Processing (NLP) algorithm was applied to the electronic health records of a large London hospital to identify patients meeting the European Society of Cardiology criteria for a diagnosis of HFpEF. NLP extracted patient demographics (including self-reported ethnicity and socioeconomic status), comorbidities, investigation results (N-terminal pro-B-type natriuretic peptide, H2FPEF scores, and echocardiogram reports), and mortality. Analyses were stratified by ethnicity and adjusted for socioeconomic status.
RESULTS Our cohort consisted of 1,261 (64%) White, 578 (29%) Black, and 134 (7%) Asian patients meeting the European Society of Cardiology HFpEF diagnostic criteria. Compared to White patients, Black patients were younger at diagnosis and more likely to have metabolic comorbidities (obesity, diabetes, and hypertension) but less likely to have atrial fibrillation (13% vs 30%; P < 0.001). Black patients had lower N-terminal pro-B-type natriuretic peptide levels and a lower frequency of H2FPEF scores ≥6, indicative of likely HFpEF (26% vs 44%; P < 0.0001).
CONCLUSIONS Leveraging an NLP-based artificial intelligence approach to quantify health inequities in HFpEF diagnosis, we discovered that established markers systematically underdiagnose HFpEF in Black patients, possibly due to differences in the underlying comorbidity patterns. Clinicians should be aware of these limitations and their implications for treatment and trial recruitment.

Heart failure with preserved ejection fraction (HFpEF) accounts for approximately half of all cases of heart failure (HF), 1,2 with an increasing prevalence among older adults and ethnic minority groups. 3 HFpEF is considered to be a clinically heterogeneous syndrome, posing a diagnostic challenge. 1 [5][6][7] However, evaluations of the H2FPEF score across diverse ethnicities have yielded mixed results 8 ; in a cohort of 233 Asian adults with a clinical diagnosis of HF and left ventricular ejection fraction (LVEF) ≥50%, the H2FPEF score had a sensitivity of 24.9%. 8 Health equity in HFpEF management requires diagnostic tools with high efficacy in all patient groups. This clinical need has increasing urgency with recent trial data demonstrating prognostic benefit in HFpEF treated with sodium glucose cotransporter-2 inhibitors (SGLT2i). 9,10
A distinct concern arises in the disproportionate impact of HFpEF on Black patients, who are affected at a younger age, possibly related to their higher prevalence of key HFpEF risk factors including hypertension, obesity, and diabetes. 3 However, despite these disparities, HFpEF research and trials have predominantly investigated White populations. 7,11,12 Thus, less is known about the relationship between race/ethnicity and HFpEF, with resultant potential for health inequity.
We applied a validated Natural Language Processing (NLP) algorithm to the electronic health records (EHRs) of a large central London secondary and tertiary care hospital to identify all HF cases that fulfilled the full European Society of Cardiology (ESC) 2021 HFpEF diagnostic criteria. 13 This novel approach uses artificial intelligence (AI) to extract undiagnosed HFpEF cases from the free-text portion of the EHR, thereby acquiring a more detailed clinical picture than that from structured data and aiming to mitigate the bias associated with a clinician-assigned diagnosis, including the scale and harms of undiagnosed HFpEF as outlined in our recent study. 14 Here, we utilize this cohort of HFpEF patients to examine racial and ethnic disparities in the performance of existing diagnostic markers (H2FPEF score and N-terminal pro-B-type natriuretic peptide [NT-proBNP]) and link these to underlying comorbidities and survival outcomes.
The ESC criteria include clinical signs and symptoms of HF (as indicated above by "heart failure" in the clinical notes), an LVEF ≥50%, and "evidence of cardiac structural and/or functional abnormalities consistent with the presence of LV diastolic dysfunction/raised LV filling pressures" 13 (Figure 1). The latter includes one of: raised NT-proBNP ≥125 pg/mL (≥375 pg/mL with atrial fibrillation [AF]), LV mass index ≥95 g/m² (female) or ≥115 g/m² (male), relative wall thickness >0.42, left atrial volume index >34 mL/m² (>40 mL/m² with AF), Doppler echocardiographic E/e' ratio (ratio of early diastolic mitral inflow velocity to mitral annulus relaxation velocity) >9, pulmonary artery systolic pressure (PASP) >35 mm Hg, or tricuspid regurgitation velocity >2.8 m/s. 13 This methodology allowed us to identify patients who met the ESC criteria even if they had not been given a formal diagnosis of HFpEF ("Undiagnosed HFpEF").
Self-reported race and ethnicity were obtained from clinical coded data. When reporting race and ethnicity, we adhered to the latest scientific guidance on their reporting. 19 Only terms predating the first HF mention were included as comorbidities.
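The ESC-based inclusion logic described above (documented HF, an LVEF of at least 50%, and at least one marker of structural or functional abnormality) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Measurements` container and all function and field names are hypothetical, missing values are assumed to count as "criterion not met", and the male LV mass index cut-off of 115 g/m² is assumed to be inclusive like the female one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measurements:
    """Hypothetical per-patient record; None means the value was not recorded."""
    lvef_percent: float
    sex: str  # "female" or "male"
    has_af: bool
    nt_probnp_pg_ml: Optional[float] = None
    lv_mass_index_g_m2: Optional[float] = None
    relative_wall_thickness: Optional[float] = None
    la_volume_index_ml_m2: Optional[float] = None
    e_over_e_prime: Optional[float] = None
    pasp_mm_hg: Optional[float] = None
    tr_velocity_m_s: Optional[float] = None


def structural_or_functional_evidence(m: Measurements) -> bool:
    """Return True if at least one of the listed thresholds is met."""

    def exceeds(value: Optional[float], threshold: float, inclusive: bool = False) -> bool:
        # Assumption: an unrecorded value never satisfies a criterion.
        if value is None:
            return False
        return value >= threshold if inclusive else value > threshold

    checks = [
        # NT-proBNP: higher cut-off in the presence of AF.
        exceeds(m.nt_probnp_pg_ml, 375 if m.has_af else 125, inclusive=True),
        # LV mass index: sex-specific cut-offs (male cut-off assumed inclusive).
        exceeds(m.lv_mass_index_g_m2, 95 if m.sex == "female" else 115, inclusive=True),
        exceeds(m.relative_wall_thickness, 0.42),
        # Left atrial volume index: higher cut-off in the presence of AF.
        exceeds(m.la_volume_index_ml_m2, 40 if m.has_af else 34),
        exceeds(m.e_over_e_prime, 9),
        exceeds(m.pasp_mm_hg, 35),
        exceeds(m.tr_velocity_m_s, 2.8),
    ]
    return any(checks)


def meets_esc_hfpef(m: Measurements, hf_documented: bool) -> bool:
    """Combine the three requirements described in the text."""
    return hf_documented and m.lvef_percent >= 50 and structural_or_functional_evidence(m)
```

For example, a patient with AF and an NT-proBNP of 300 pg/mL as the only available marker would not qualify under this sketch, whereas the same value without AF would.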
Baseline characteristics were reported according to race and ethnicity groups. Hospitalizations were determined from discharge summaries. Postcodes were cross-referenced with The English Indices of Deprivation 2019 statistics to calculate an Index of Multiple Deprivation (IMD) score as a surrogate for socioeconomic status. 20 Ethnicities were compared according to the proportion in the lowest IMD quintile. H2FPEF scores (0-9) were calculated and
Baseline characteristics stratified by ethnicity and diagnostic group (clinician-assigned vs NLP-identified ESC criteria patients) are available in Supplemental Table 1.
HF medication use was analyzed, showing no significant differences in loop diuretic use between ethnicities (P = 0.18) (Supplemental Table 2).

DISCUSSION
There are several important findings from this race and ethnicity analysis of HFpEF. Firstly, by using NLP to assemble a cohort of patients meeting the ESC 2021 HFpEF diagnostic criteria, we demonstrated that the H2FPEF score underdiagnosed HFpEF in Black patients relative to White patients in our cohort (H2FPEF score ≥6 in 26% vs 44%). Secondly, we confirmed in a diverse population that the comorbidity profile of HFpEF patients varies significantly by ethnicity (Central Illustration). Specifically, Black patients were younger and more likely to have metabolic comorbidities, while White patients were older and had a higher prevalence of AF and CAD. Finally, White patients with HFpEF experienced higher mortality than Black patients.
HFpEF is a heterogeneous syndrome with distinct clinical phenotypes reflecting a varying prevalence of comorbidities. 22 [26][27][28][29] Black patients with HFpEF are younger and have more hypertension, obesity, and diabetes. These distinct comorbidity profiles may have pathophysiological implications that influence HFpEF development and survival. [32]
Beyond pathophysiology, these findings have important diagnostic implications. We found that in an HFpEF cohort selected using the ESC diagnostic criteria, Black and Asian patients had statistically significantly lower H2FPEF scores and were less likely to be categorized as likely HFpEF (ie, H2FPEF ≥6).
This disconnect between the ESC criteria, which are based on biochemistry and echocardiography values, and the H2FPEF score is likely due to the greater emphasis on comorbidities in the H2FPEF scoring system, particularly AF, as has been noted in other cohorts. 33 The weighting given to AF in H2FPEF (the presence of paroxysmal or persistent AF scoring three out of nine possible points) becomes highly relevant in groups with low rates of AF, such as Black patients in our HFpEF cohort (only 13% compared to an overall prevalence of 24% in our HFpEF population and 34% in the original H2FPEF HFpEF cohort) (Figures 2 and 3A). 7 This may explain why H2FPEF has limited diagnostic ability in our Black patient cohort (Figure 3B).
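To make the effect of the AF weighting concrete, the point-based score can be sketched as below. This is an illustrative sketch only: the text here confirms the 0-9 range, the three points for AF, and the ≥6 cut-off for likely HFpEF, while the remaining component definitions and weights are taken from the published H2FPEF score as we understand it and should be treated as assumptions. Function and parameter names are hypothetical.

```python
def h2fpef_score(
    bmi_kg_m2: float,
    n_antihypertensives: int,
    has_af: bool,
    pasp_mm_hg: float,
    age_years: float,
    e_over_e_prime: float,
) -> int:
    """Sum the six components of the point-based score (range 0-9)."""
    score = 0
    score += 2 if bmi_kg_m2 > 30 else 0            # Heavy (assumed weight)
    score += 1 if n_antihypertensives >= 2 else 0  # Hypertensive (assumed weight)
    score += 3 if has_af else 0                    # Atrial fibrillation (3 of 9 points, per the text)
    score += 1 if pasp_mm_hg > 35 else 0           # Pulmonary hypertension (assumed weight)
    score += 1 if age_years > 60 else 0            # Elder (assumed weight)
    score += 1 if e_over_e_prime > 9 else 0        # Filling pressure (assumed weight)
    return score


def likely_hfpef(score: int) -> bool:
    """Apply the >=6 cut-off used in the text for 'likely HFpEF'."""
    return score >= 6
```

Under these assumptions, a 55-year-old patient with obesity, two antihypertensive drugs, a PASP of 38 mm Hg, and an E/e' of 10 scores 5 without AF and 8 with AF. More generally, a patient without AF can reach the ≥6 cut-off only by scoring every other component, including age above 60 years, which is consistent with the lower scores seen in younger cohorts with little AF.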
Previous studies also demonstrated variance in the sensitivity of H2FPEF in diagnosing HFpEF dependent on the background rate of AF, with H2FPEF performing the poorest in populations of younger patients with less AF and better in the opposite groups. 8,34 We note that our findings were consistent even when considering only those patients with a clinician-assigned formal diagnosis of HFpEF and so cannot be solely attributed to differences between the populations captured by the ESC criteria and H2FPEF scores. The reduced clinical applicability in diagnosing Black patients with HFpEF, who already have lower NT-proBNP values, 35 has consequences for trial recruitment and treatment, especially now that SGLT2i are guideline-recommended therapy. 36
The diagnosis of HFpEF continues to be a challenge. Each diagnostic algorithm or set of criteria has a different focus, incorporates different variables, and thus may preferentially diagnose different HFpEF phenotypes. The ESC criteria acknowledge this and are purposefully broad to enhance their applicability, particularly when other diagnostic aids demonstrate consistent discordance in their categorization of HFpEF and vary widely in their sensitivities. 6,8,21,33,37 This highlights the issue of currently having one HFpEF definition that may encompass a variety of phenotypes that differ by ethnicity and prognosis. We used the current gold standard ESC criteria with their broader applicability, but future research will need to address this diagnostic uncertainty and find ways to understand HFpEF better for each individual. Developing a more comprehensive understanding of the diagnosis of HFpEF across race and ethnicity should lead to better health equity by improving clinical trial diversity. In the meantime, clinicians should be cautious as to the potential pitfalls of relying on a single scoring system to diagnose HFpEF in Black patients and instead use such scores in conjunction with other clinical and laboratory findings to reach an accurate diagnosis.
We found that Black patients in our cohort had a lower mortality compared with White patients. This has been demonstrated elsewhere by other groups, 25 even when adjusting for comorbidities. 26 In our cohort, when adjusted for factors including age and high NT-proBNP levels, the difference in survival lost significance, suggesting that the reason for the lower mortality relates to HFpEF being less advanced in these patients.
Strengths of our study include the use of NLP AI methods for case identification, a technique that has been validated previously by our group 14 and others. 38 Furthermore, by applying the ESC HFpEF diagnostic criteria as our inclusion criteria, we ensured that our patients met an internationally recognized definition of HFpEF. Additionally, collation of a cohort of this type would traditionally be performed by manual review, which itself is an imperfect process, difficult to standardize, and open to personal biases. The review of over 1 million patients would not have been possible manually and underlines the utility of our approach using NLP. Finally, by being performed in the National Health Service, where health care is free at the point of use, this study is likely less impacted by systemic differences such as payer biases.
STUDY LIMITATIONS. Firstly, this was a single-center retrospective study that relied on accurate free-text clinical documentation. Additionally, we were unable to obtain the cause of death so could not explore differences between cardiac and noncardiac death.
Most importantly, we acknowledge that race and ethnicity are social constructs without scientific meaning and thus the grouping of race and ethnicity should be done via self-identification using the broadest possible language. While certain groups display disproportionate burdens of disease, a large proportion of this may reflect systemic disparities in the social determinants of health. A limitation, therefore, is our lack of ability to subcategorize race and ethnicities, which risks the oversimplification of findings attributable to one race or ethnicity.

METHODS
STUDY COHORT. An NLP tool was used to establish a single-center retrospective database of adult patients with a diagnosis of HFpEF from EHR at King's College Hospital National Health Service Foundation Trust between 2010 and 2022, as detailed in our recent study. 14 The study operated under London South-East Research Ethics Committee approval granted to the King's Electronic Records Research Interface, which did not require written informed patient consent. This study complies with the Declaration of Helsinki. Briefly, patients were included, whether inpatient or outpatient, if there were two or more mentions of a diagnosis of "heart failure" in the clinical notes determined by NLP. 14,15 Next, variables were extracted from echocardiogram reports. Patients were only included for assessment if they had an LVEF ≥50% recorded within a year of the first HF mention and excluded if at any point their LVEF was <50%. Patients were also excluded if there was an alternative diagnosis documented including hypertrophic cardiomyopathy, restrictive cardiomyopathy, constrictive pericarditis, cardiac amyloidosis, or severe valvular disease. Patients were included if they met the ESC criteria for a diagnosis of HFpEF.
"Confirmed HFpEF" or a clinician-assigned diagnosis of HFpEF was identified from NLP using free-text documentation of "HFpEF" mentions in the clinical notes. For confidence in the NLP-based retrieval, manual validation of the presence of "HFpEF" mentions was performed on 100 randomly sampled (49 of whom were White, 35 were Black, 10 were Asian, and 5 did not have a recorded ethnicity) NLP-identified clinician-assigned HFpEF patients (ie, "Confirmed HFpEF"), with a 100% true positive rate.
DATA PROCESSING. SNOMED-CT (Systematized Nomenclature of Medicine-Clinical Terms) concepts were extracted from EHR using the validated MedCAT (Medical Concept Annotation Tool) 16 and MedCATTrainer 17 NLP tools available from the CogStack ecosystem. NLP extraction and data processing have been detailed in our recent study. 14 The performance of the MedCAT NLP pipeline in detecting comorbidity mentions has been previously assessed on 5,617 annotations from 265 documents, and the F1 (harmonic mean of precision and recall) measures were all >0.85. 18 External validation of the cohort has been previously performed and demonstrated a similar pattern of clinical characteristics and mortality differences between the Confirmed HFpEF and Undiagnosed HFpEF cohorts. 14
STUDY VARIABLES. The SNOMED-CT concepts extracted from the clinical notes included demographics (age, sex), comorbidities (diabetes mellitus type 2, hypertension, chronic kidney disease [CKD], AF, myocardial infarction, coronary artery disease [CAD], stroke, and transient ischemic attack), medications (calcium channel blockers, beta-blockers, loop diuretics, mineralocorticoid receptor antagonists, angiotensin-converting enzyme inhibitors, angiotensin receptor blockers, SGLT2i, and insulin), and laboratory values (NT-proBNP, BNP, hemoglobin, creatinine, HbA1c, and body mass index [BMI]).

ABBREVIATIONS AND ACRONYMS
AI = artificial intelligence
CAD = coronary artery disease
CKD = chronic kidney disease
ESC = European Society of Cardiology
HFpEF = heart failure with preserved ejection fraction
IMD = Index of Multiple Deprivation
LVEF = left ventricular ejection fraction
NLP = Natural Language Processing
NT-proBNP = N-terminal pro-B-type natriuretic peptide
PASP = pulmonary artery systolic pressure

FIGURE 1 Cohort Overview

FIGURE 4 Survival Outcomes

FIGURE 5 Cox Regression Analysis of HF Markers

CENTRAL ILLUSTRATION Race and Ethnicity-Related Differences in HFpEF Diagnosed by Natural Language Processing
Brown S, et al. JACC Adv. 2024;3(8):101064.
Pie chart (central) shows the number of patients diagnosed with HFpEF according to the European Society of Cardiology diagnostic criteria stratified by self-reported ethnicity. (A) Bar chart showing the number of patients with H2FPEF score ≥6 by ethnicity, with P values. (B) Bar chart depicting the median NT-proBNP values for each ethnic group, along with their IQRs. ESC = European Society of Cardiology; NT-proBNP = N-terminal pro-B-type natriuretic peptide.

TABLE 1 Baseline Patient Characteristics
Values are median (IQR) or n (%). a Kruskal-Wallis rank sum test; Pearson's chi-square test. BMI = body mass index; IMD = Index of Multiple Deprivation; TIA = transient ischemic attack.

TABLE 2 Heart Failure Diagnostic Markers and Laboratory Values